Three-dimensional reconstruction method, apparatus and system, model training method and storage medium

ABSTRACT

A three-dimensional reconstruction method includes: acquiring first images captured by a first camera, each of the first images being an image containing a target object; determining a shooting angle of each of the first images, the shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images; determining an angle interval corresponding to each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle, and setting the first image as a target image in the angle interval; and three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.

CROSS-REFERENCE TO RELATED APPLICATION

The present disclosure is a 371 of PCT Application No. PCT/CN2020/077850, filed on Mar. 4, 2020, which claims priority to Chinese Patent Application No. 201910333474.0, filed on Apr. 24, 2019 and entitled “THREE-DIMENSIONAL RECONSTRUCTION METHOD, APPARATUS AND SYSTEM, MODEL TRAINING METHOD AND STORAGE MEDIUM”, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

Embodiments of the present disclosure relate to a three-dimensional reconstruction method, apparatus and system, and a model training method and a storage medium thereof.

BACKGROUND

Some multi-functional fitting mirrors may provide a fitting effect in the case that a user does not actually try on clothes, which improves the convenience of fitting and saves the fitting time. Before the fitting mirror is used, a three-dimensional image of a target object (in general, a user who uses the fitting mirror) needs to be acquired. The three-dimensional image is usually obtained by three-dimensionally reconstructing target images containing the target object acquired by a camera at various shooting angles.

SUMMARY

Various embodiments of the present disclosure provide a three-dimensional reconstruction method. The method includes:

acquiring first images captured by a first camera, each of the first images being an image containing a target object;

determining a shooting angle of each of the first images, the shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images;

determining an angle interval corresponding to each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle, and setting the first image as a target image in the angle interval; and

three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.

Optionally, determining the shooting angle of each of the first images includes:

inputting the first image into an angle recognition model;

receiving angle information outputted by the angle recognition model; and

determining the angle information as the shooting angle;

wherein the angle recognition model is a model obtained by learning and training sample images and shooting angles of the sample images.

Optionally, after acquiring the first images captured by the first camera, the method further includes:

assigning a tag to each of the first images, the tag being configured to mark the target object in the first image; and

classifying the first images containing the target object into an image set based on the tag; and

prior to three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object, the method further includes:

acquiring the first images corresponding to the respective angle intervals based on the image set.

Optionally, after determining the angle interval corresponding to each of the first images from the plurality of angle intervals included in the angle range [0, 360°) based on the shooting angle, the method further includes:

identifying, for each of the plurality of angle intervals, whether the image set includes a plurality of first images corresponding to the angle interval;

quality-scoring the plurality of first images to obtain an image quality score of each of the first images in the plurality of first images, in response to identifying that the image set includes the plurality of first images corresponding to the angle interval; and

retaining the first image with the highest image quality score, and deleting the remaining first images.

Optionally, after acquiring the first images captured by the first camera, the method further includes:

identifying whether a resolution of each of the first images is less than a resolution threshold;

deleting the first image in response to identifying that the resolution of the first image is less than the resolution threshold; and

modifying the first image into an image with a specified resolution in response to identifying that the resolution of the first image is not less than the resolution threshold, wherein the specified resolution is greater than or equal to the resolution threshold.

Optionally, three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object includes:

three-dimensionally reconstructing, when each of the plurality of angle intervals is provided with a corresponding target image, the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object.

Optionally, three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object includes:

acquiring, when a three-dimensional reconstruction instruction is received, a plurality of first images containing the target object based on information of the target object carried by the three-dimensional reconstruction instruction;

three-dimensionally reconstructing the target object based on each of the first images to obtain a three-dimensional image of the target object;

identifying whether the three-dimensional image is an incomplete three-dimensional image; and

repairing the incomplete three-dimensional image, in response to identifying that the three-dimensional image is an incomplete three-dimensional image, to obtain a repaired three-dimensional image.

Various embodiments of the present disclosure provide a three-dimensional reconstruction apparatus. The apparatus includes:

a first acquiring module, configured to acquire first images captured by a first camera, each of the first images containing a target object;

a first determining module, configured to determine a shooting angle of each of the first images by using an angle recognition model, the shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images, and the angle recognition model being a model obtained by learning and training sample images and shooting angles of the sample images;

a second determining module, configured to determine an angle interval corresponding to each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle of the first image, and set the first image as a target image in the angle interval; and

a three-dimensional reconstructing module, configured to three-dimensionally reconstruct the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.

Various embodiments of the present disclosure provide a three-dimensional reconstruction system. The system includes a reconstruction server and a first camera, wherein the reconstruction server includes the above-mentioned three-dimensional reconstruction apparatus.

Optionally, the system further includes a fitting mirror, wherein

the fitting mirror is configured to, when a target object is detected, send an acquisition request to the reconstruction server, the acquisition request carrying information of the target object; and

the reconstruction server is configured to send an acquisition response to the fitting mirror based on the information of the target object, the acquisition response carrying a three-dimensional image of the target object.

Various embodiments of the present disclosure provide a model training method for training an angle recognition model. The method includes:

performing training a plurality of times until the accuracy with which the angle recognition model classifies shooting angles of sample images in a training image set reaches a predetermined threshold, wherein the training includes:

acquiring sample images containing a sample object captured by a second camera and a depth map corresponding to each of the sample images;

acquiring a first key point and a second key point of the sample object from the depth map;

determining a shooting angle of each of the sample images based on three-dimensional coordinates of the first key point and three-dimensional coordinates of the second key point, the shooting angle being configured to characterize a direction relative to the sample object when the second camera shoots the sample images; and

inputting the sample image into a deep learning model to obtain a predicted shooting angle of the sample image, and determining a classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.

Optionally, determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point includes:

calculating a shooting angle of the sample image by using an angle calculation formula, wherein the angle calculation formula is:

$\left\{ \begin{matrix} V_{1} = \left( x_{2} - x_{1},\ z_{2} - z_{1} \right) \\ V_{2} \cdot V_{1} = 0 \\ \alpha = \arccos\left( \frac{V_{2} \cdot V_{Z}}{\left| V_{2} \right| \times \left| V_{Z} \right|} \right) \\ \left| V_{2} \right| = \left| V_{Z} \right| = 1 \end{matrix} \right.$

wherein the three-dimensional coordinates of the first key point are (x₁, y₁, z₁), and the three-dimensional coordinates of the second key point are (x₂, y₂, z₂); V₁ represents a vector of a connection line between the first key point and the second key point in an XZ plane in a world coordinate system; V₂ represents a unit vector perpendicular to V₁; V_Z represents a unit vector parallel to the Z axis in the world coordinate system; and α represents the shooting angle.
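The angle calculation formula can be illustrated with a minimal Python sketch. The function name `shooting_angle_deg` is hypothetical, and the sketch makes one assumption that the formula leaves open: of the two unit vectors perpendicular to V₁ in the XZ plane, it picks (−Δz, Δx)/|V₁|. Choosing the opposite perpendicular would yield 180° − α instead.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) in the world coordinate system


def shooting_angle_deg(p1: Point3D, p2: Point3D) -> float:
    """Shooting angle in degrees computed from two key points.

    Assumption: V2 is taken as (-dz, dx) / |V1|; the source formula only
    requires V2 to be a unit vector perpendicular to V1.
    """
    x1, _, z1 = p1
    x2, _, z2 = p2
    dx, dz = x2 - x1, z2 - z1          # V1 = (x2 - x1, z2 - z1) in the XZ plane
    norm = math.hypot(dx, dz)
    if norm == 0.0:
        raise ValueError("key points coincide in the XZ plane; V1 is undefined")
    v2 = (-dz / norm, dx / norm)       # unit vector with V2 . V1 = 0
    vz = (0.0, 1.0)                    # unit vector along the Z axis, as (x, z)
    cos_alpha = v2[0] * vz[0] + v2[1] * vz[1]
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding error
    return math.degrees(math.acos(cos_alpha))
```

Because arccos only returns values in [0°, 180°], this computation alone cannot distinguish a front-facing posture from a back-facing one.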

Optionally, after determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training further includes:

identifying, based on the sample image, whether an orientation posture of the sample object relative to the second camera is a back-facing orientation posture; and

correcting the shooting angle by using a correction calculation formula, when the orientation posture of the sample object relative to the second camera is the back-facing orientation posture, to obtain a corrected shooting angle, wherein the correction calculation formula is:

α1=α2+180°;

wherein α1 is the corrected shooting angle; and α2 is the shooting angle before correction.
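As a small illustration of the correction calculation formula, the following Python sketch applies α1 = α2 + 180° only when the sample object is back-facing. The function name `correct_shooting_angle` is hypothetical, and wrapping the result into [0, 360°) with a modulo is an assumption added here so that the corrected angle stays inside the angle range used elsewhere in the disclosure.

```python
def correct_shooting_angle(alpha_deg: float, back_facing: bool) -> float:
    """Apply alpha1 = alpha2 + 180 degrees for a back-facing orientation posture.

    Assumption: the result is wrapped into [0, 360) so that 180 + 180 maps to 0.
    """
    if back_facing:
        return (alpha_deg + 180.0) % 360.0
    return alpha_deg
```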

Optionally, prior to determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training further includes:

identifying whether a distance between the first key point and the second key point is less than a distance threshold; and

determining that the shooting angle of the sample image is a specified angle in response to identifying that the distance between the first key point and the second key point is less than the distance threshold, wherein the specified angle is any angle within an angle interval of a fixed range.

Various embodiments of the present disclosure provide a non-volatile computer-readable storage medium storing at least one code instruction therein, wherein the at least one code instruction, when executed by a processor, enables the processor to perform the above-mentioned three-dimensional reconstruction method.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate technical solutions in embodiments of the present disclosure more clearly, a brief introduction of the drawings used in the embodiments will be provided herein. Obviously, the drawings described below merely illustrate some embodiments of the present disclosure, and a person of ordinary skill in the art can also obtain other drawings according to these drawings without creative work.

FIG. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure;

FIG. 2 is a block diagram of a model training system involved in a model training method according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure;

FIG. 4 is an effect diagram when a first camera shoots a target object;

FIG. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure;

FIG. 6 is a flowchart of a three-dimensional reconstruction method according to yet another embodiment of the present disclosure;

FIG. 7 is a flowchart of a training process according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of a joint point of a sample object according to an embodiment of the present disclosure;

FIG. 9 is a block diagram of a three-dimensional reconstruction apparatus according to an embodiment of the present disclosure;

FIG. 10 is a block diagram of a three-dimensional reconstruction apparatus according to another embodiment of the present disclosure; and

FIG. 11 is a block diagram of a three-dimensional reconstruction apparatus according to yet another embodiment of the present disclosure.

DETAILED DESCRIPTION

For clearer descriptions of the objects, technical solutions and advantages of the present disclosure, embodiments of the present disclosure will be described in detail below in combination with the accompanying drawings.

Referring to FIG. 1, FIG. 1 is a block diagram of a three-dimensional reconstruction system involved in a three-dimensional reconstruction method according to an embodiment of the present disclosure. The three-dimensional reconstruction system 100 may include at least one first camera 101 and a reconstruction server 102.

The first camera 101 may generally be a surveillance camera including an RGB camera lens, an infrared camera lens or the like. In general, a plurality of first cameras 101 are provided. The plurality of first cameras 101 may be deployed at different locations in a mall or shop. The reconstruction server 102 may be a server, or a server cluster composed of several servers, or a cloud computing server center, or a computer device. Each of the first cameras 101 may establish a communication connection with the reconstruction server 102.

Optionally, the three-dimensional reconstruction system 100 may further include a fitting mirror 103. The fitting mirror 103 may generally be deployed in a store such as a clothing store. The fitting mirror 103 may provide users with a virtual fitting service. The fitting mirror 103 may establish a communication connection with the reconstruction server 102.

Referring to FIG. 2, FIG. 2 is a block diagram of a model training system involved in a model training method provided by an embodiment of the present disclosure. The model training system 200 may include a second camera 201 and a training server 202.

The second camera 201 may be a camera including a depth camera lens, or may be a binocular camera. The second camera may acquire a color map (also called an RGB map) or a depth map. A pixel value of each pixel in the depth map is a depth value, wherein the depth value is used to indicate a distance of the corresponding pixel from the second camera. The training server 202 may be a server, or a server cluster composed of several servers, or a cloud computing server center, or a computer device. The second camera 201 may establish a wired or wireless communication connection with the training server 202.

Referring to FIG. 3, FIG. 3 is a flowchart of a three-dimensional reconstruction method according to an embodiment of the present disclosure. The three-dimensional reconstruction method is applicable to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1. The three-dimensional reconstruction method may include the following steps.

In step S301, first images captured by a first camera are acquired. Each of the first images contains a target object.

In step S302, a shooting angle of each of the first images is determined, wherein the shooting angle is configured to characterize a shooting direction relative to the target object when the first camera shoots the first images. In some embodiments of the present disclosure, the shooting angle of each of the first images may be determined by using an angle recognition model, and the angle recognition model is a model obtained by learning and training sample images and shooting angles of the sample images.

In step S303, an angle interval corresponding to each of the first images is determined from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle, and the first image is set as a target image in the angle interval.

In the embodiment of the present disclosure, the plurality of angle intervals are obtained by dividing the angle range [0, 360°), and the angle values contained in each of the angle intervals are different.

Exemplarily, referring to FIG. 4, FIG. 4 is an effect diagram when the first camera shoots a target object. Taking a target object 01 as the center, a first camera 02 shoots the target object 01 from different directions. The shooting directions when the first camera 02 shoots the target object 01 may be characterized by shooting angles. For example, one angle interval is defined for every 15° of counterclockwise rotation about the target object 01. The angle information included in each of the angle intervals corresponds to the shooting angles obtained in step S302. Therefore, each shooting angle acquired in step S302 falls within one of the plurality of angle intervals.

In step S304, the target object is three-dimensionally reconstructed based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.

In the related art, in order to acquire a three-dimensional image of a target object, it is necessary to acquire target images containing the target object at various shooting angles by a camera lens. Since the target images acquired by the camera lens lack information about shooting angles of the target images, it is necessary to sort the target images according to a shooting order of the respective target images to ensure that two angle intervals corresponding to two adjacent images are adjacent. This method requires a large calculation amount for three-dimensional reconstruction, resulting in relatively low efficiency in acquiring a three-dimensional image.

However, in the embodiment of the present disclosure, the shooting angle of each of the first images may be determined by the angle recognition model, and the angle interval corresponding to each of the first images in the plurality of angle intervals may be determined based on these shooting angles. Therefore, when the first images are acquired, the shooting angles of the first images may also be acquired. In the subsequent three-dimensional reconstruction of the target object, there is no need to use additional algorithms to sort the plurality of first images, and the order of the plurality of first images may be acquired directly based on the shooting angles, thereby effectively reducing the calculation amount during three-dimensional reconstruction and improving the efficiency of acquiring the three-dimensional image.
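The point that the order of the first images follows directly from their shooting angles can be sketched in Python as follows. This is only an illustration: the function name `order_by_shooting_angle` and the representation of each first image as an (identifier, shooting angle) pair are assumptions, not part of the disclosed method.

```python
from typing import List, Tuple

# Hypothetical record: (image identifier, shooting angle in degrees within [0, 360)).
AngledImage = Tuple[str, float]


def order_by_shooting_angle(images: List[AngledImage]) -> List[AngledImage]:
    """Order first images by shooting angle; no separate sorting algorithm
    based on image content or shooting order is needed."""
    return sorted(images, key=lambda item: item[1])
```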

In summary, in the three-dimensional reconstruction method according to the embodiment of the present disclosure, the shooting angle of each of the first images can be determined by the angle recognition model. The angle interval corresponding to each of the first images in the plurality of angle intervals can be determined based on the shooting angles, and the first images are set as the target images in the angle intervals. Subsequently, the target object can be three-dimensionally reconstructed based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object. The shooting angle of each of the first images can also be acquired when the first images are acquired. In the subsequent three-dimensional reconstruction of the target object, there is no need to use additional algorithms to sort a plurality of first images, and the order of the plurality of first images may be acquired directly based on the shooting angles, thereby effectively reducing the calculation amount during three-dimensional reconstruction and improving the efficiency of acquiring the three-dimensional image.

Referring to FIG. 5, FIG. 5 is a flowchart of a three-dimensional reconstruction method according to another embodiment of the present disclosure. The three-dimensional reconstruction method is applicable to the reconstruction server 102 in the three-dimensional reconstruction system 100 shown in FIG. 1. The three-dimensional reconstruction method may include the following steps.

In step S401, first images captured by a first camera are acquired.

Each of the first images is an image containing a target object captured by the first camera. The target object may be a person, an animal, or an object. In the case that the first camera is a surveillance camera and is deployed in a mall or store, the target object is a person in the mall or store.

In the embodiment of the present disclosure, assuming that an image captured by the first camera contains a plurality of different target objects, the reconstruction server may crop a plurality of first images from this image, wherein the target object in each of the first images is different.

In step S402, whether a resolution of each first image is less than a resolution threshold is identified.

Exemplarily, step S403 is performed in response to identifying that the resolution of the first image is less than the resolution threshold; and step S404 is performed in response to identifying that the resolution of the first image is not less than the resolution threshold. For example, the resolution threshold is 224×112.

In step S403, the first image is deleted in response to identifying that the resolution of the first image is less than the resolution threshold.

In some embodiments of the present disclosure, in the case that the resolution of the acquired first image is relatively low, the display effect of a three-dimensional image obtained after subsequent three-dimensional reconstruction is relatively poor. Therefore, the first images with relatively low resolutions may be deleted before the three-dimensional reconstruction. Exemplarily, the reconstruction server may identify whether the resolution of each of the first images is less than the resolution threshold. The first image is deleted in response to the reconstruction server identifying that the resolution of the first image is less than the resolution threshold.

In step S404, the first image is modified into an image with a specified resolution in response to identifying that the resolution of the first image is not less than the resolution threshold. The specified resolution is greater than or equal to the resolution threshold.

In some embodiments of the present disclosure, in response to identifying that the resolution of the first image is not less than the resolution threshold, the first image may be modified into an image with a specified resolution to facilitate subsequent three-dimensional reconstruction. Exemplarily, in response to identifying that the resolution of the first image is greater than the specified resolution, the reconstruction server needs to compress the resolution of the first image to the specified resolution; and in response to identifying that the resolution of the first image is less than the specified resolution, the reconstruction server needs to expand the resolution of the first image to the specified resolution.
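Steps S402 to S404 can be summarized with a minimal Python sketch of the decision logic. Several details here are assumptions rather than part of the disclosure: resolutions are represented as (width, height) pairs, two resolutions are compared by total pixel count, the specified resolution 448×224 is a hypothetical value chosen only to satisfy "greater than or equal to the resolution threshold", and the function name `plan_resolution_action` is illustrative. The actual resampling of pixels is outside the scope of the sketch.

```python
from typing import Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels; the ordering is an assumption

RESOLUTION_THRESHOLD: Resolution = (224, 112)  # example threshold given in the text
SPECIFIED_RESOLUTION: Resolution = (448, 224)  # hypothetical value, not from the source


def pixel_count(resolution: Resolution) -> int:
    """Total number of pixels; used here as the comparison metric (an assumption)."""
    return resolution[0] * resolution[1]


def plan_resolution_action(resolution: Resolution) -> str:
    """Return which of the described actions applies to a first image."""
    pixels = pixel_count(resolution)
    if pixels < pixel_count(RESOLUTION_THRESHOLD):
        return "delete"    # step S403: resolution below the threshold
    if pixels > pixel_count(SPECIFIED_RESOLUTION):
        return "compress"  # step S404: reduce to the specified resolution
    if pixels < pixel_count(SPECIFIED_RESOLUTION):
        return "expand"    # step S404: enlarge to the specified resolution
    return "keep"          # already at the specified resolution
```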

In step S405, a tag is assigned to each of the first images, wherein the tag is configured to mark a target object in the first image.

For example, the first images may be marked by using a target recognition algorithm. In some embodiments of the present disclosure, assuming that the first camera is deployed in a store or mall and the target object is a person in the store or mall, the target recognition algorithm may be a pedestrian movement route detection algorithm. The reconstruction server may mark each of the first images with a tag by the pedestrian movement route detection algorithm, wherein the tag is configured to mark the target object in the first image. Exemplarily, the pedestrian movement route detection algorithm may be used to analyze at least one of a clothing feature, a face feature, and a morphological feature of the target object, so as to mark each of the first images with a tag.

In step S406, the first images are classified into an image set based on the tag.

In the embodiment of the present disclosure, the first images provided with the same tag contain the same target object. Therefore, the reconstruction server may classify the first images containing the same target object into an image set based on the tag.

It should be noted that, after the first images are classified in step S406, the target objects in the first images in the same image set are the same, but the target objects in the first images in different image sets are different.
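The classification in step S406 can be sketched in Python as a grouping of first images by tag. This is an illustrative sketch only: the function name `classify_into_image_sets` and the representation of each tagged first image as a (tag, image identifier) pair are assumptions, and the tags are taken as already produced by the target recognition algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def classify_into_image_sets(
    tagged_images: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Group (tag, image_id) pairs so that each image set holds one target object."""
    image_sets: Dict[str, List[str]] = defaultdict(list)
    for tag, image_id in tagged_images:
        image_sets[tag].append(image_id)
    return dict(image_sets)
```

Each key of the returned dictionary plays the role of one image set, so first images in the same set share a target object and first images in different sets do not.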

In step S407, a shooting angle of each of the first images is determined by using an angle recognition model.

In the embodiment of the present disclosure, the reconstruction server may determine the shooting angle of each of the first images by using the angle recognition model. The angle recognition model is a model obtained by learning and training sample images and shooting angles of the sample images. A method for acquiring the angle recognition model will be introduced in the subsequent embodiments, and is not repeated here.

Exemplarily, determining the shooting angle of each of the first images through the reconstruction server by using the angle recognition model may include the following steps.

In step A1, the first image is inputted into the angle recognition model.

In step B1, angle information outputted by the angle recognition model is received.

In step C1, the angle information is determined as the shooting angle of the first image.

In step S408, an angle interval corresponding to each of the first images is determined based on the shooting angle of the first image. The angle interval refers to an angle interval among a plurality of angle intervals included in an angle range [0, 360°), and the first image is determined as a target image in the angle interval.

In the embodiment of the present disclosure, assuming that the first camera is arranged in a store or mall, and the target object is a person in the store or mall, the reconstruction server may acquire the first images containing the target object captured by the first camera at different shooting angles in real time during the walking process of the target object. Exemplarily, taking the target object as the center, one angle interval is defined for every 15° of clockwise or counterclockwise rotation, so that the number of the plurality of angle intervals is 24. The reconstruction server may determine the angle interval corresponding to the first image from the plurality of angle intervals based on the shooting angle of each of the first images, and set the first image as a target image in the angle interval. For example, assuming that the shooting angle of the first image is 10°, the angle interval corresponding to the first image is [0, 15°), and the first image is set as a target image within the angle interval of [0, 15°). The target images are used for three-dimensional reconstruction.
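The mapping from a shooting angle to its angle interval can be written as a short Python sketch using the 15° example above. The function name `angle_interval` is hypothetical, and normalizing out-of-range inputs with a modulo is an assumption added for robustness; the disclosure only considers angles already within [0, 360°).

```python
from typing import Tuple

INTERVAL_WIDTH_DEG = 15                     # example interval width from the text
NUM_INTERVALS = 360 // INTERVAL_WIDTH_DEG   # 24 intervals covering [0, 360)


def angle_interval(shooting_angle_deg: float) -> Tuple[int, int]:
    """Return the half-open interval [low, high), in degrees, containing the angle."""
    normalized = shooting_angle_deg % 360.0  # assumption: wrap inputs into [0, 360)
    # The extra modulo guards the floating-point corner case where the
    # wrapped value rounds to exactly 360.0.
    index = int(normalized // INTERVAL_WIDTH_DEG) % NUM_INTERVALS
    low = index * INTERVAL_WIDTH_DEG
    return low, low + INTERVAL_WIDTH_DEG
```

For the example in the text, `angle_interval(10)` returns `(0, 15)`, that is, the interval [0, 15°).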

In step S409, for each of the plurality of angle intervals, whether there are more than two first images in the same image set corresponding to the angle interval is identified based on the image set.

In the embodiment of the present disclosure, since the reconstruction server subsequently needs to three-dimensionally reconstruct the target object by reference to the target image corresponding to each of the angle intervals, more than two target images containing the same target object corresponding to one angle interval may be present. If the target object were three-dimensionally reconstructed directly by reference to a plurality of first images corresponding to the angle interval and containing the same target object, the efficiency of three-dimensionally reconstructing the target object might be affected. Therefore, the reconstruction server may identify whether more than two target images are present in the same image set corresponding to one angle interval based on the image set, i.e., identify whether more than two target images containing the same target object corresponding to one angle interval are present.

In some embodiments of the present disclosure, for each of the angle intervals, the reconstruction server may identify whether there are more than two target images corresponding to the angle interval in the same image set. In the case that more than two target images are present in the same image set corresponding to the angle interval, that is, more than two target images containing the same target object corresponding to the angle interval are present, step S410 is performed; in the case that no more than two target images are present in the same image set corresponding to the angle interval, step S409 is repeated.

In step S410, in the case that more than two first images are present in the same image set corresponding to the angle interval, the more than two first images in the same image set are quality-scored to obtain an image quality score of each of the first images.

Exemplarily, in the case that more than two first images are present in the same image set corresponding to the angle interval, the reconstruction server may quality-score the more than two first images corresponding to the angle interval by using an image quality scoring algorithm to obtain a quality score of each of the first images.

In step S411, the first image with the highest image quality score is retained and set as the target image in the angle interval, and the remaining first images are deleted.

In the embodiment of the present disclosure, a higher quality score indicates a higher definition of the first image. When a first image with higher definition is set as the target image in the corresponding angle interval, the quality of the three-dimensional image obtained during subsequent three-dimensional reconstruction based on the target image is better. Therefore, the reconstruction server may retain the first image with the highest image quality score, and delete the remaining first images, so as to ensure that each of the angle intervals corresponds to only one first image with relatively higher definition, thereby effectively improving the imaging quality of the three-dimensional image obtained by the subsequent three-dimensional reconstruction of the target object, reducing the number of the first images that need to be processed during the three-dimensional reconstruction, and thus improving the efficiency of the three-dimensional reconstruction of the target object.
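Steps S409 to S411 amount to keeping, for each angle interval, only the first image with the highest quality score. A minimal Python sketch follows; the function name `keep_best_per_interval` and the (interval index, image identifier, quality score) record layout are assumptions, and the quality scores are taken as given, since the disclosure does not specify the image quality scoring algorithm. Ties are resolved in favor of the image seen first, which is also an assumption.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record: (angle interval index, image identifier, image quality score).
ScoredImage = Tuple[int, str, float]


def keep_best_per_interval(images: Iterable[ScoredImage]) -> Dict[int, str]:
    """For each angle interval, keep the image with the highest quality score."""
    best: Dict[int, Tuple[str, float]] = {}
    for interval, image_id, score in images:
        # Strict comparison keeps the earlier image when scores are equal.
        if interval not in best or score > best[interval][1]:
            best[interval] = (image_id, score)
    return {interval: image_id for interval, (image_id, _) in best.items()}
```

The returned mapping holds exactly one target image per occupied angle interval; every first image not present in it corresponds to a deleted image.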

In step S412, the target object is three-dimensionally reconstructed based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.

It should be noted that the first images containing the same target object belong to the same image set. Therefore, prior to step S412, the three-dimensional reconstruction method may further include: acquiring first images corresponding to the respective angle intervals and containing the same target object based on the image set. It should also be noted that, after obtaining the three-dimensional image of the target object, the reconstruction server may store the three-dimensional image in a memory of the reconstruction server.

In the embodiment of the present disclosure, the reconstruction server may three-dimensionally reconstruct the target object that meets three-dimensional reconstruction conditions. There are various conditions for the three-dimensional reconstruction. The embodiments of the present disclosure are schematically illustrated in the following two optional examples.

In the first optional example, when the reconstruction server has determined a target image for each of the plurality of angle intervals, the target object meets the three-dimensional reconstruction conditions.

In this case, step S412 may include: in the case that each of the plurality of angle intervals is provided with a target image, three-dimensionally reconstructing the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object. Exemplarily, the reconstruction server may three-dimensionally reconstruct the target object by using a structure from motion (SFM) algorithm to obtain a three-dimensional image of the target object.

In the embodiment of the present disclosure, the process by which the reconstruction server determines that each of the plurality of angle intervals is provided with a target image containing the same target object may include the following steps.

In step A2, for each image set, the angle interval corresponding to each of the first images in the image set is acquired.

In the embodiment of the present disclosure, after steps S401 to S411, the reconstruction server may determine the angle interval corresponding to each of the first images in each image set. In addition, in the same image set, each of the angle intervals corresponds to only one first image, and this first image is set as the target image in this angle interval. Therefore, for each image set, the reconstruction server may acquire the angle interval corresponding to each of the first images in the image set in real time.

In step B2, whether the number of angle intervals corresponding to all target images is the same as the number of the plurality of angle intervals is identified.

Exemplarily, the reconstruction server may determine that each of the plurality of angle intervals is provided with a target image in response to identifying that the number of the angle intervals corresponding to all target images is the same as the number of the plurality of angle intervals, that is, step C2 is performed. The reconstruction server may determine that at least one of the plurality of angle intervals is not provided with a target image in response to identifying that the number of angle intervals corresponding to all target images is different from the number of the plurality of angle intervals, that is, step A1 is repeated.

In step C2, it is determined that each of the plurality of angle intervals is provided with the target image in the case that the number of angle intervals corresponding to all target images is the same as the number of the plurality of angle intervals.

In the embodiment of the present disclosure, after determining that each of the plurality of angle intervals is provided with the target image, the reconstruction server may determine that the target object is provided with target images in each of the plurality of angle intervals, and thus that the target object meets the three-dimensional reconstruction conditions.
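The check in steps B2 and C2 reduces to comparing the number of distinct angle intervals that already hold a target image with the total number of angle intervals. A minimal Python sketch follows, assuming the angle intervals are indexed 0 to `num_intervals - 1`; the function name and the indexing scheme are illustrative assumptions rather than part of the disclosure.

```python
from typing import Iterable


def all_intervals_covered(target_intervals: Iterable[int], num_intervals: int) -> bool:
    """Return True when every angle interval holds a target image.

    target_intervals: interval index of each target image in one image set.
    num_intervals: total number of angle intervals (assumed indexing 0..n-1).
    """
    # Count distinct intervals only, ignoring any out-of-range index.
    covered = {index for index in target_intervals if 0 <= index < num_intervals}
    return len(covered) == num_intervals
```

For example, with four angle intervals, the index list `[0, 1, 2, 3]` satisfies the condition, while `[0, 1, 1, 3]` does not, because interval 2 has no target image.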

In the second optional example, when the reconstruction server receives a three-dimensional reconstruction instruction carrying information of the target object, the target object meets the three-dimensional reconstruction conditions.

In this case, step S412 may include the following steps.

In step A3, a plurality of first images containing the target object are acquired based on the information of the target object carried by the three-dimensional reconstruction instruction when the three-dimensional reconstruction instruction is received.

Exemplarily, the three-dimensional reconstruction system may further include a fitting mirror. The three-dimensional reconstruction instruction may be an instruction sent by the fitting mirror. In the embodiment of the present disclosure, when the reconstruction server receives the three-dimensional reconstruction instruction carrying the information of the target object, the reconstruction server may acquire a plurality of first images containing the target object based on the information of the target object.

For example, the information of the target object may include at least one of a clothing feature, a face feature, and a morphological feature. Since the reconstruction server may also analyze at least one of the clothing feature, the face feature, and the morphological feature of the target object after acquiring the first images, the reconstruction server may acquire the plurality of first images containing the target object based on the information of the target object.

In step B3, the target object is three-dimensionally reconstructed based on each of the first images to obtain a three-dimensional image of the target object. Specifically, based on the plurality of first images, the angle intervals corresponding to the plurality of first images are determined, and the plurality of first images are set as target images in the corresponding angle intervals, respectively.

Based on the target images in the corresponding angle intervals, the target object is then three-dimensionally reconstructed to obtain a three-dimensional image of the target object.

Exemplarily, the reconstruction server may three-dimensionally reconstruct the target object based on each of the first images containing the target object by using the SFM algorithm to obtain a three-dimensional image of the target object.

In step C3, whether the three-dimensional image is an incomplete three-dimensional image is identified.

In the embodiment of the present disclosure, the reconstruction server may rely on only a small number of first images during the three-dimensional reconstruction of the target object; that is, at least one of the plurality of angle intervals may not be provided with a first image containing the target object. Consequently, the three-dimensional image obtained after three-dimensional reconstruction may be an incomplete three-dimensional image, for example, a three-dimensional image containing voids at some angles. The reconstruction server may identify whether the three-dimensional image is an incomplete three-dimensional image. Step D3 is performed in response to identifying that the three-dimensional image is an incomplete three-dimensional image; and this process ends in response to identifying that the three-dimensional image is a complete three-dimensional image.

In step D3, the incomplete three-dimensional image is repaired to obtain a repaired three-dimensional image in response to identifying that the three-dimensional image is an incomplete three-dimensional image.

In the embodiment of the present disclosure, in order to acquire a three-dimensional image with a relatively high image quality, the reconstruction server, after identifying that the three-dimensional image is an incomplete three-dimensional image, needs to repair the incomplete three-dimensional image to obtain a repaired three-dimensional image. For example, assuming that a target object to be reconstructed is a human-like target object, the reconstruction server may repair the three-dimensional image according to the regularities of three-dimensional images of the human body.

It should be noted that the sequence of the steps of the three-dimensional reconstruction method provided by the embodiment of the present disclosure may be appropriately adjusted. For example, step S407 may be performed first, followed by steps S405 and S406. The steps may also be deleted or added according to the situation. Within the technical scope disclosed in the present disclosure, any variations of the method easily derived by a person skilled in the art shall fall within the protection scope of the present disclosure, which is not repeated herein.

In summary, in the three-dimensional reconstruction method according to the embodiment of the present disclosure, the shooting angle of each of the first images can be determined by the angle recognition model. The angle interval of each of the first images can be determined from the plurality of angle intervals based on the shooting angles, and the first images are set as the target images in the corresponding angle intervals. Subsequently, the target object can be three-dimensionally reconstructed based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object. When the first images are acquired, the shooting angle of each of the first images can also be acquired. In the subsequent three-dimensional reconstruction of the target object, there is no need to use additional algorithms to sort a plurality of target images, and the order of the plurality of first images can be acquired directly based on the shooting angles, thereby effectively reducing the calculation amount during three-dimensional reconstruction and improving the efficiency of acquiring the three-dimensional image.

Referring to FIG. 6, FIG. 6 is a flowchart of a three-dimensional reconstruction method according to yet another embodiment of the present disclosure. The three-dimensional reconstruction method is applicable to the three-dimensional reconstruction system 100 shown in FIG. 1. The three-dimensional reconstruction method may include the following steps.

In step S501, a first camera captures images.

In the embodiment of the present disclosure, the first camera may be a surveillance camera. A plurality of first cameras may be provided and deployed at different locations in a mall or a store.

In step S502, each of the first cameras sends the captured images to a reconstruction server.

In the embodiment of the present disclosure, each of the first cameras may send the images captured in real time to the reconstruction server, such that the reconstruction server may three-dimensionally reconstruct a target object.

In step S503, the reconstruction server three-dimensionally reconstructs the target object based on the images captured by the first cameras to obtain a three-dimensional image of the target object.

It should be noted that, for the process by which the reconstruction server three-dimensionally reconstructs the target object based on the images captured by the first cameras to obtain the three-dimensional image of the target object, reference may be made to the relevant content in steps S401 to S412, which is not repeated herein.

In step S504, a fitting mirror sends an acquisition request to the reconstruction server, the acquisition request carrying information of the target object.

In the embodiment of the present disclosure, it is assumed that the target object is a person standing in front of the fitting mirror. In order to provide a virtual fitting service for the target object, the fitting mirror needs to acquire the three-dimensional image of the target object from the reconstruction server.

Exemplarily, a fitting mirror camera may be provided in the fitting mirror. The fitting mirror camera may capture information of the target object located in front of the fitting mirror, and the fitting mirror may then send an acquisition request to the reconstruction server, the acquisition request carrying the information of the target object.

In step S505, the reconstruction server sends an acquisition response to the fitting mirror based on the information of the target object, the acquisition response carrying the three-dimensional image of the target object.

In some embodiments of the present disclosure, the reconstruction server, after receiving the acquisition request that carries the information of the target object, may first identify, based on the information of the target object, whether the three-dimensional image of the target object is stored.

In some embodiments of the present disclosure, the information of the target object may be face information. The reconstruction server may also acquire the face information of the target object when acquiring the three-dimensional image of the target object. Therefore, the reconstruction server may identify, based on the face information, whether the three-dimensional image of the target object is stored.

In response to identifying that the three-dimensional image of the target object is stored, for example, in the case that the face information of the target object carried in the acquisition request is identical to the face information corresponding to a three-dimensional image stored by the reconstruction server, the reconstruction server may determine that the three-dimensional image of the target object is stored. In this case, the reconstruction server may send an acquisition response carrying the three-dimensional image of the target object to the fitting mirror.

In response to identifying that the three-dimensional image of the target object is not stored, for example, in the case that the face information corresponding to each of the three-dimensional images stored in the reconstruction server is not identical to the face information of the target object carried in the acquisition request, the reconstruction server may determine that it does not store the three-dimensional image of the target object. In this case, the reconstruction server may send a response to the fitting mirror, the response indicating that the three-dimensional image of the target object is not stored in the reconstruction server. The fitting mirror may send a three-dimensional reconstruction instruction carrying information of the target object to the reconstruction server based on this response. The reconstruction server may three-dimensionally reconstruct the target object based on the three-dimensional reconstruction instruction, and then send an acquisition response carrying the three-dimensional image of the target object to the fitting mirror. For the process by which the reconstruction server three-dimensionally reconstructs the target object based on the three-dimensional reconstruction instruction, reference may be made to the corresponding process in step S412, which is not repeated herein.
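The exchange in steps S504 and S505 resembles a lookup with a reconstruct-on-miss fallback. The Python sketch below illustrates that control flow under several assumptions: the face information is reduced to an exact-match string key (a real system would more likely compare face features by similarity), the three-dimensional image is represented as opaque bytes, and `ReconstructionStore`, `fetch_model`, and the `reconstruct` callback are hypothetical names standing in for the reconstruction server, the fitting mirror logic, and the process of step S412, respectively.

```python
from typing import Callable, Dict, Optional


class ReconstructionStore:
    """In-memory stand-in for the three-dimensional images kept by the server."""

    def __init__(self) -> None:
        self._models: Dict[str, bytes] = {}

    def save(self, face_key: str, model: bytes) -> None:
        self._models[face_key] = model

    def handle_acquisition_request(self, face_key: str) -> Optional[bytes]:
        # None plays the role of the "not stored" response.
        return self._models.get(face_key)


def fetch_model(
    store: ReconstructionStore,
    face_key: str,
    reconstruct: Callable[[str], bytes],
) -> bytes:
    """Return the stored model, or trigger reconstruction when none is stored."""
    model = store.handle_acquisition_request(face_key)
    if model is None:
        # Corresponds to sending a three-dimensional reconstruction instruction.
        model = reconstruct(face_key)
        store.save(face_key, model)
    return model
```

Saving the freshly reconstructed model means that a second request for the same target object is served from storage without reconstructing again.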

In step S506, the fitting mirror provides a virtual fitting service to the target object based on the acquisition response.

In the embodiment of the present disclosure, after receiving the acquisition response carrying the three-dimensional image of the target object sent by the reconstruction server, the fitting mirror may provide a virtual fitting service to the target object based on the acquisition response.

It should be noted that the image quality of the three-dimensional image of the target object carried in the acquisition response may be relatively poor. For example, when the reconstruction server acquires a three-dimensional image based on a small number of images containing the target object, the image quality of the three-dimensional image acquired by the reconstruction server is relatively poor. Therefore, the fitting mirror may analyze the image quality of the three-dimensional image of the target object carried in the acquisition response, to determine whether the virtual fitting service provided to the target object would be affected. In the case that the virtual fitting service to the target object would be affected, the fitting mirror may send out a voice message prompting the target object to rotate in a circle. In this case, after the target object rotates, images of the target object may be captured again at different shooting angles by the fitting mirror camera. The reconstruction server may three-dimensionally reconstruct the target object again based on these images. In this case, the imaging quality of the three-dimensional image of the target object is relatively high.

In related arts, it is usually necessary for the user to rotate by a circle in front of the fitting mirror when using the fitting mirror, such that the camera installed in the fitting mirror captures images containing the user from different shooting angles, and three-dimensional reconstruction is then performed based on these images to obtain a three-dimensional image of the user. Therefore, acquisition of the three-dimensional image by three-dimensional reconstruction takes a long time when the user uses the fitting mirror.

However, in the embodiment of the present disclosure, the first cameras are arranged in a store or mall. The reconstruction server may acquire images of the user captured by the first cameras from different shooting angles in real time, directly perform three-dimensional reconstruction based on the images after the three-dimensional reconstruction conditions are satisfied, and then send the obtained three-dimensional image to the fitting mirror. It is unnecessary for the user to rotate by a circle in front of the fitting mirror when the user uses the fitting mirror. In addition, the three-dimensional image of the user can be acquired directly, without waiting for three-dimensional reconstruction to obtain the three-dimensional image, thereby improving the user experience.

It should be noted that the sequence of the steps of the three-dimensional reconstruction method provided by the embodiment of the present disclosure may be appropriately adjusted, and the steps may also be deleted or added according to the situation. Within the technical scope disclosed in the present disclosure, any variations of the method easily derived by a person skilled in the art shall fall within the protection scope of the present disclosure, which is not repeated herein.

In summary, in the three-dimensional reconstruction method according to the embodiment of the present disclosure, the shooting angle of each of the images can be determined by the angle recognition model. The angle interval corresponding to each of the images may be determined from the plurality of angle intervals based on the shooting angles. Subsequently, the target object can be three-dimensionally reconstructed based on the images corresponding to the respective angle intervals and containing the same target object to obtain a three-dimensional image of the target object. The shooting angles of the images can also be acquired when the images are acquired. In the subsequent three-dimensional reconstruction of the target object, there is no need to use additional algorithms to sort a plurality of first images, and the order of the plurality of first images may be acquired directly based on the shooting angles, thereby effectively reducing the calculation amount during three-dimensional reconstruction and improving the efficiency of acquiring the three-dimensional image. In addition, it is unnecessary for the user to rotate by a circle in front of the fitting mirror when the user uses the fitting mirror. Further, the three-dimensional image of the user may be acquired directly, without waiting for three-dimensional reconstruction to obtain the three-dimensional image, thereby improving the user experience.

An embodiment of the present disclosure also provides a model training method, which is used to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6. This model training method is applied to the training server 202 in the model training system 200 shown in FIG. 2. The model training method may include:

performing a training process a plurality of times until the accuracy of classifying shooting angles of sample images in a training image set by the angle recognition model reaches a predetermined threshold.

Referring to FIG. 7, FIG. 7 is a flowchart of a training process according to at least one embodiment of the present disclosure. The training process may include the following steps.

In step S601, sample images containing a sample object captured by a second camera and a depth map corresponding to each of the sample images are acquired.

In the embodiment of the present disclosure, the sample object may be a person, an animal, or an object. The training server may use the second camera to capture sample images containing the sample object and a depth map corresponding to each of the sample images. The second camera may be a camera including a depth camera lens, or may be a binocular camera. For example, the second camera may be a device with a depth camera lens, such as a Kinect device. It should be noted that the second camera can capture a depth map and a color map at the same time. Therefore, after the sample object is captured by the second camera, the training server can acquire the sample images containing the sample object captured by the second camera and the depth map corresponding to each of the sample images at the same time.

It should also be noted that the color map and the depth map acquired by the second camera after shooting the sample object include not only the sample object but also background content other than the sample object. In order to facilitate subsequent image processing, after the second camera shoots the sample object, the training server also needs to crop the acquired depth map and color map, such that the cropped sample images and the corresponding depth maps thereof contain only the sample object.

In step S602, a first key point and a second key point of the sample object are acquired from the depth map.

In the embodiment of the present disclosure, in the case that the sample object is a person, the first key point and the second key point of the sample object may be two shoulder joint points of the person, respectively. It should be noted that after the Kinect device captures the depth map containing the sample object, the Kinect device may capture all joint points of the sample object. For example, as shown in FIG. 8, the Kinect device may capture 14 joint points of the sample object. At this time, the training server may acquire two shoulder joint points a and b of the sample object in the depth map.

In step S603, a shooting angle of each of the sample images is determined based on three-dimensional coordinates of the first key point and three-dimensional coordinates of the second key point. The shooting angle is configured to characterize a direction relative to the sample object when the second camera shoots the sample images.

In the embodiment of the present disclosure, an angle between a direction perpendicular to a connection line between the first key point and the second key point and a Z-axis direction in a world coordinate system may be determined as the shooting angle of the sample image. The Z-axis direction in the world coordinate system is generally parallel to an optical axis direction of the second camera. The training server may determine the shooting angle of the sample image based on three-dimensional coordinates of the first key point and three-dimensional coordinates of the second key point.

The position of a key point in the depth map may be determined from X-axis and Y-axis components of three-dimensional coordinates of the key point, and a depth value of this key point may be determined from a Z-axis component of the three-dimensional coordinates of the key point. It should be noted that, after acquiring the sample images and the corresponding depth maps thereof, the training server may determine three-dimensional coordinates of any point in the depth map.

In some embodiments of the present disclosure, determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point may include: calculating a shooting angle of the sample image by using an angle calculation formula, wherein the angle calculation formula is:

$\left\{ \begin{matrix} V_{1} = \left( x_{2} - x_{1},\ z_{2} - z_{1} \right) \\ V_{2} \cdot V_{1} = 0 \\ \alpha = \arccos\left( \dfrac{V_{2} \cdot V_{Z}}{\left| V_{2} \right| \times \left| V_{Z} \right|} \right) \\ \left| V_{2} \right| = \left| V_{Z} \right| = 1 \end{matrix} \right.;$

wherein the three-dimensional coordinates of the first key point are (x₁, y₁, z₁), and the three-dimensional coordinates of the second key point are (x₂, y₂, z₂); V₁ represents a vector of a connection line between the first key point and the second key point in an XZ plane in a world coordinate system; V₂ represents a unit vector perpendicular to V₁; V_Z represents a unit vector parallel to the Z axis in the world coordinate system; and α represents the shooting angle.
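As an illustration of the angle calculation formula, the following Python sketch computes the shooting angle from the two key points. It is a sketch under stated assumptions: the formula does not fix which of the two unit vectors perpendicular to V₁ is used, so the code assumes V₂ is V₁ rotated by +90° in the XZ plane, and V_Z is taken as (0, 1) in (x, z) components. The function name `shooting_angle_degrees` is hypothetical.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def shooting_angle_degrees(first: Point3D, second: Point3D) -> float:
    """Angle between the normal of the key-point line and the Z axis, in degrees."""
    x1, _, z1 = first
    x2, _, z2 = second
    dx, dz = x2 - x1, z2 - z1            # V1, the connection line in the XZ plane
    norm = math.hypot(dx, dz)
    if norm == 0.0:
        raise ValueError("key points coincide in the XZ plane")
    # V2: unit vector perpendicular to V1 (assumed: V1 rotated by +90 degrees).
    v2_x, v2_z = -dz / norm, dx / norm
    # VZ = (0, 1), so the dot product V2 . VZ is simply the z component of V2.
    cos_alpha = max(-1.0, min(1.0, v2_z))  # clamp against rounding error
    return math.degrees(math.acos(cos_alpha))
```

Because arccos returns values in [0°, 180°], this formula alone cannot distinguish two shooting directions that differ by 180°, which is the ambiguity addressed by the special cases that follow.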

In the embodiment of the present disclosure, there are two special cases when the training server determines the shooting angle of each of the sample images.

In the first special case, in the case that the training server determines the shooting angle of each of the sample images only based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, there may be two sample images with the same shooting angle but different shooting directions of the second camera. For example, the shooting angle when the second camera shoots the sample object in the current shooting direction is identical to the shooting angle when the second camera shoots the sample object after the shooting direction is rotated by 180°. Therefore, in order to distinguish two sample images with the same shooting angle but different shooting directions of the second camera, after step S603, the training process may further include the following steps.

In step A4, whether an orientation posture of each sample object relative to the second camera is a back-facing orientation posture is identified based on the sample images.

In the embodiment of the present disclosure, the training server may identify whether the orientation posture of the sample object relative to the second camera is a back-facing orientation posture or a forward orientation posture based on the sample image. When the training server identifies that the orientation posture of the sample object relative to the second camera is the back-facing orientation posture, the shooting angle of the sample image needs to be corrected, and step B4 is performed. When the training server identifies that the orientation posture of the sample object relative to the second camera is the forward orientation posture, the shooting angle of the sample image does not need to be corrected.

In step B4, when the orientation posture of the sample object relative to the second camera is the back-facing orientation posture, the shooting angle is corrected by using a correction calculation formula to obtain a corrected shooting angle. The correction calculation formula is:

α₁ = α₂ + 180°, wherein α₁ is the corrected shooting angle, and α₂ is the shooting angle before correction.

In the embodiment of the present disclosure, in order to distinguish two sample images with the same shooting angle but different shooting directions, the training server may correct the shooting angle of the sample image in response to identifying that the orientation posture of the sample object relative to the second camera is the back-facing orientation posture, such that shooting angles of any two sample images captured by the second camera in different shooting directions are also different.

In the second special case, in the case that the orientation posture of the sample object relative to the second camera is a lateral posture, the first key point and the second key point of the sample object almost overlap, such that the accuracy with which the training server determines the shooting angle of the sample image based on the three-dimensional coordinates of the first key point and the second key point is relatively low. Therefore, in order to improve the accuracy of the shooting angle of the sample image, prior to step S603, the training process may further include the following steps.

In step A5, whether a distance between the first key point and the second key point is less than a distance threshold is identified.

In some embodiments of the present disclosure, as shown in FIG. 8, it is assumed that the sample object is a person. A first key point and a second key point of the sample object may be two shoulder joint points a and b of the person, respectively, and the distance threshold may be a distance between a head joint point c and a neck joint point d. The training server may calculate a distance between the two shoulder joint points a and b of the sample object in a depth map, and compare this distance with the distance threshold (that is, the distance between the head joint point c and the neck joint point d), to identify whether the distance between the first key point and the second key point is less than the distance threshold. Step B5 is performed in response to identifying that the distance between the first key point and the second key point is less than the distance threshold. Step S603 is performed in response to identifying that the distance between the first key point and the second key point is not less than the distance threshold.

In step B5, it is determined that the shooting angle of the sample image is a specified angle in response to identifying that the distance between the first key point and the second key point is less than the distance threshold. The specified angle is any angle within an angle interval of a fixed range.

In the embodiment of the present disclosure, in response to identifying that the distance between the first key point and the second key point is less than the distance threshold, the training server may determine the shooting angle of the sample image as the specified angle.

Exemplarily, the specified angle may be 90° or 270°. In order to determine the shooting angle of the sample image more accurately, the training server needs to determine the orientation posture of the sample object relative to the second camera based on the sample image, and determine whether the shooting angle of the sample image is 90° or 270° based on the orientation posture of the sample object relative to the second camera. For example, the orientation posture of the sample object relative to the second camera may further include: a rightward orientation posture and a leftward orientation posture. When the orientation posture of the sample object relative to the second camera is the rightward orientation posture, the shooting angle of the sample image is 90°. When the orientation posture of the sample object relative to the second camera is the leftward orientation posture, the shooting angle of the sample image is 270°.
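The two special cases can be combined into a small decision procedure, sketched below in Python. The sketch assumes that the orientation posture has already been classified elsewhere and is passed in as one of the hypothetical labels "front", "back", "left", or "right", and that the angle from step S603 is supplied as `base_angle`. The function names are illustrative, and the joint coordinates are assumed to be expressed in the same unit.

```python
import math
from typing import Optional, Sequence


def shoulders_overlap(a: Sequence[float], b: Sequence[float],
                      c: Sequence[float], d: Sequence[float]) -> bool:
    """Step A5: is the shoulder distance a-b below the head-neck distance c-d?"""
    return math.dist(a, b) < math.dist(c, d)


def resolve_shooting_angle(base_angle: Optional[float], posture: str,
                           lateral: bool) -> float:
    """Apply the lateral (step B5) and back-facing (step B4) rules.

    base_angle: angle from step S603 in degrees; may be None when lateral,
    because step S603 is skipped in that case.
    """
    if lateral:
        if posture == "right":
            return 90.0
        if posture == "left":
            return 270.0
        raise ValueError("a lateral sample needs a 'left' or 'right' posture")
    if base_angle is None:
        raise ValueError("base_angle is required for non-lateral samples")
    if posture == "back":
        return base_angle + 180.0   # correction formula: corrected = original + 180
    return base_angle
```

For instance, a back-facing sample with a base angle of 30° is labeled 210°, while a lateral sample facing right is labeled 90° regardless of the key-point coordinates.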

In step S604, the sample image is inputted into a deep learning model toobtain a predicted shooting angle of the sample image, and aclassification accuracy of the shooting angle is determined according tothe shooting angle and the predicted shooting angle of the sample image.

In the embodiment of the present disclosure, the deep learning model maylearn a correspondence between the sample image and the shooting angle.After the deep learning model has completed learning the correspondencebetween the sample image and the shooting angle, the predicted shootingangle of the sample image can be obtained. After determining theclassification accuracy of the shooting angle according to the shootingangle and the predicted shooting angle of the sample image, the trainingserver may identify whether the classification accuracy is greater thana predetermined threshold. When the classification accuracy is greaterthan or equal to the predetermined threshold, the training of the sampleimage is ended, and a new sample image may be inputted into the deeplearning model later. When the classification accuracy is less than thepredetermined threshold, step S604 is repeated to input the sample imageinto the deep learning model again.

It should be noted that the angle recognition model may be obtained by performing the training process of steps S601 to S604 above many times, until the accuracy with which the angle recognition model classifies the shooting angles of sample images in a training image set reaches a predetermined threshold.
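The stopping rule described above can be sketched as a loop that repeats training until the classification accuracy reaches the threshold. This is a minimal sketch under stated assumptions: `MemorizingModel` is a hypothetical stand-in for the deep learning model, `max_rounds` is a hypothetical safety cap, and shooting angles are treated as discrete class labels.

```python
class MemorizingModel:
    """Hypothetical stand-in for the deep learning model (not from the source)."""

    def __init__(self):
        self._table = {}

    def learn(self, image, angle_class):
        # A real model would update its weights here.
        self._table[image] = angle_class

    def predict(self, image):
        return self._table.get(image)


def train_until_accurate(model, samples, threshold, max_rounds=50):
    """Repeat training until accuracy >= threshold or max_rounds is reached.

    samples is a list of (image, angle_class) pairs.
    Returns (rounds_used, final_accuracy).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    accuracy = 0.0
    for round_idx in range(1, max_rounds + 1):
        for image, label in samples:
            model.learn(image, label)
        correct = sum(1 for image, label in samples if model.predict(image) == label)
        accuracy = correct / len(samples)
        if accuracy >= threshold:
            return round_idx, accuracy
    return max_rounds, accuracy
```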

Exemplarily, in the above step S604, a loss value Loss of a loss function may be determined according to the shooting angle and the predicted shooting angle of the sample image. The loss value may be determined by the following calculation formula:

Loss=CE(a,â)+∂·MSE(a,â),

wherein a represents the predicted shooting angle of the sample image; â represents the real shooting angle of the sample image; CE represents a cross entropy; MSE represents a mean square error; and ∂=0.8 represents a fusion coefficient of the two.
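As an illustration of the fused loss, the sketch below combines a cross entropy term and a mean square error term with the fusion coefficient 0.8. The disclosure does not state how the two terms are applied to angles, so this sketch assumes that the model outputs a probability for each angle class and that the real shooting angle is given as a class index; the function names are hypothetical.

```python
import math

FUSION_COEFFICIENT = 0.8  # value of the fusion coefficient given in the text


def cross_entropy(probs, true_index):
    """Cross entropy of predicted class probabilities against the true class."""
    return -math.log(max(probs[true_index], 1e-12))


def mean_squared_error(probs, true_index):
    """Mean square error between the probabilities and a one-hot target."""
    total = 0.0
    for i, p in enumerate(probs):
        target = 1.0 if i == true_index else 0.0
        total += (p - target) ** 2
    return total / len(probs)


def fused_loss(probs, true_index, coefficient=FUSION_COEFFICIENT):
    """Loss = CE + coefficient * MSE, under the assumptions stated above."""
    return cross_entropy(probs, true_index) + coefficient * mean_squared_error(probs, true_index)
```

With two equally likely classes, the cross entropy term is ln 2 and the mean square error term is 0.25, so the fused loss is ln 2 + 0.8 × 0.25.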

Other parameters of the deep learning model are configured as follows: the resolution of the input sample image is 224×112; the optimizer is an Adam optimizer; and the number of iterations is 50. One iteration means that the deep learning model learns the correspondence between the sample image and the shooting angle of the sample image once.

It should be noted that the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5 or FIG. 6 may be obtained by the above steps. When a target image is inputted into the angle recognition model, the angle recognition model may output a shooting angle of the target image.

At least one embodiment of the present disclosure provides a three-dimensional reconstruction apparatus, which includes a processor; and

a memory for storing at least one program code executable by the processor. The at least one program code, when executed by the processor, enables the processor to be configured to:

acquire first images captured by a first camera, each of the first images containing a target object;

determine a shooting angle of each of the first images by using an angle recognition model, each shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images, and the angle recognition model being a model obtained by learning and training sample images and shooting angles of the sample images;

determine an angle interval corresponding to each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle of the first image, and set the first image as a target image in the angle interval; and

three-dimensionally reconstruct the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.
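The step of assigning each first image to an angle interval within [0, 360°) can be sketched as follows. This is a minimal sketch: the number of intervals (12) and the use of equal-width intervals are assumptions made for illustration, and the function names are hypothetical.

```python
def angle_interval_index(angle_deg, num_intervals=12):
    """Return the index of the equal-width interval of [0, 360) containing angle_deg."""
    width = 360.0 / num_intervals
    # The final modulo guards against a floating-point result of exactly 360.0.
    return int((angle_deg % 360.0) // width) % num_intervals


def assign_target_images(images_with_angles, num_intervals=12):
    """Group (image_id, shooting_angle) pairs by angle interval index."""
    buckets = {}
    for image_id, angle in images_with_angles:
        index = angle_interval_index(angle, num_intervals)
        buckets.setdefault(index, []).append(image_id)
    return buckets
```

For example, with 12 intervals of 30° each, an image shot at 45° falls in interval 1, and an image shot at 359.9° falls in interval 11.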

At least one embodiment of the present disclosure provides a three-dimensional reconstruction apparatus. FIG. 9 shows a block diagram of a three-dimensional reconstruction apparatus according to an embodiment of the present disclosure. The three-dimensional reconstruction apparatus 700 may be integrated in the reconstruction server 102 in the three-dimensional reconstruction system 100 as shown in FIG. 1. The three-dimensional reconstruction apparatus 700 may include:

a first acquiring module 701, configured to acquire target images captured by a first camera, each of the target images being an image containing a target object;

a first determining module 702, configured to determine a shooting angle of each of the target images by using an angle recognition model, each shooting angle being configured to characterize a shooting direction when the first camera shoots the target images, and the angle recognition model being a model obtained by learning and training sample images and shooting angles of the sample images;

a second determining module 703, configured to determine an angle interval corresponding to each of the target images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle; and

a three-dimensional reconstructing module 704, configured to three-dimensionally reconstruct the target object based on the target images corresponding to the respective angle intervals and containing the target object to obtain a three-dimensional image of the target object.

In some embodiments of the present disclosure, the first determining module 702 is configured to input the target images into the angle recognition model; receive angle information outputted by the angle recognition model; and determine the angle information as the shooting angle.

FIG. 10 is a block diagram of a three-dimensional reconstruction apparatus according to another embodiment of the present disclosure. The three-dimensional reconstruction apparatus 700 may further include:

a marking module 705, configured to assign a tag to each of the target images to obtain a tag of the target image, the tag being configured to mark the target object in the target image; wherein the marking module 705 marks the target image by using a target recognition algorithm;

a classifying module 706, configured to classify the target images containing the target object into an image set based on the tag of each of the target images; and

a second acquiring module 707, configured to acquire the target images corresponding to the respective angle intervals and containing the target object, based on the image set.

In some embodiments of the present disclosure, as shown in FIG. 10, the three-dimensional reconstruction apparatus 700 may further include:

a first identifying module 708, configured to identify, based on the image set, whether each of the angle intervals corresponds to more than two target images located in the image set;

a scoring module 709, configured to quality-score the more than two target images located in the image set to obtain an image quality score of each of the target images, in response to identifying that an angle interval is correspondingly provided with more than two target images in the same image set; and

a first deleting module 710, configured to reserve the target image with the highest image quality score, and delete the remaining target images.
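The cooperation of the scoring module and the first deleting module can be sketched as keeping, for each angle interval, only the image with the highest quality score. This is a minimal sketch: the `quality_score` callable is a hypothetical placeholder, because the disclosure does not say in this passage how the score is computed.

```python
def keep_best_per_interval(buckets, quality_score):
    """For each angle interval, reserve the image with the highest quality score.

    buckets maps an interval index to a list of image identifiers.
    quality_score is a callable returning a number for an image identifier.
    """
    best = {}
    for index, images in buckets.items():
        if images:
            # max() keeps the first image among equal top scores.
            best[index] = max(images, key=quality_score)
    return best
```

A possible use, with made-up scores:

```python
scores = {"a": 0.4, "b": 0.9, "c": 0.7, "d": 0.5}
best = keep_best_per_interval({0: ["a", "b", "c"], 1: ["d"]}, scores.get)
```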

FIG. 11 is a block diagram of a three-dimensional reconstruction apparatus according to yet another embodiment of the present disclosure. The three-dimensional reconstruction apparatus 700 may further include:

a second identifying module 711, configured to identify whether a resolution of each of the target images is less than a resolution threshold;

a second deletion module 712, configured to delete the target image in response to identifying that the resolution of the target image is less than the resolution threshold; and

a modification module 713, configured to modify the target image as an image with a specified resolution in response to identifying that the resolution of the target image is not less than the resolution threshold, wherein the specified resolution is greater than or equal to the resolution threshold.
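The resolution check can be sketched as a filter that drops low-resolution images and records the specified resolution for the rest. This sketch makes two assumptions: resolution is compared as a pixel count (width × height), and actual resampling of pixel data is omitted, so only the recorded size changes. All names and numbers are hypothetical.

```python
def filter_and_normalize(images, min_pixels, specified):
    """Drop images below the resolution threshold; set the rest to `specified`.

    images is a list of (name, width, height) tuples.
    specified is a (width, height) tuple whose pixel count must be >= min_pixels.
    """
    spec_w, spec_h = specified
    if spec_w * spec_h < min_pixels:
        raise ValueError("specified resolution must not be below the threshold")
    kept = []
    for name, width, height in images:
        if width * height < min_pixels:
            continue  # deleted: resolution is less than the threshold
        kept.append((name, spec_w, spec_h))  # resampling itself is not shown
    return kept
```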

In some embodiments of the present disclosure, the three-dimensional reconstructing module 704 is configured to: in response to identifying that each of the plurality of angle intervals is provided with a corresponding target image containing the target object, three-dimensionally reconstruct the target object based on the target images corresponding to each of the angle intervals to obtain a three-dimensional image of the target object.
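The condition that every angle interval has a target image can be checked before reconstruction starts, and the images can then be handed over in angle order. The sketch below is an illustration only; the function name and the dictionary layout (interval index mapped to one image identifier) are assumptions.

```python
def images_in_angle_order(target_by_interval, num_intervals):
    """Return target images ordered by interval index, or None if any is missing.

    target_by_interval maps an interval index to the target image kept for it.
    """
    if any(index not in target_by_interval for index in range(num_intervals)):
        return None  # at least one interval has no target image yet
    return [target_by_interval[index] for index in range(num_intervals)]
```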

In some embodiments of the present disclosure, the three-dimensional reconstructing module 704 is configured to: acquire, when a three-dimensional reconstruction instruction is received, a plurality of target images containing the target object to be reconstructed based on information of the target object to be reconstructed carried by the three-dimensional reconstruction instruction; three-dimensionally reconstruct the target object to be reconstructed based on each of the target images containing the target object to be reconstructed to obtain a three-dimensional image of the target object to be reconstructed; identify whether the three-dimensional image is an incomplete three-dimensional image; and repair the incomplete three-dimensional image in response to identifying that the three-dimensional image is the incomplete three-dimensional image to obtain a repaired three-dimensional image.

In summary, in the three-dimensional reconstruction apparatus according to the embodiment of the present disclosure, the shooting angle of each of the target images can be determined by the angle recognition model, and the angle interval corresponding to the target image can be determined from a plurality of angle intervals based on the shooting angle. Subsequently, the target object can be three-dimensionally reconstructed based on the target images corresponding to the respective angle intervals and containing the target object to obtain the three-dimensional image of the target object. The shooting angles of the target images can also be acquired when the target images are acquired. In the subsequent three-dimensional reconstruction of the target object, it is unnecessary to use additional algorithms to sort a plurality of target images, and the order of the plurality of target images can be acquired directly based on the shooting angles, thereby effectively reducing the calculation amount during the three-dimensional reconstruction and improving the efficiency of acquiring the three-dimensional image.

An embodiment of the present disclosure further provides a model training apparatus. The model training apparatus may be integrated in the training server 202 in the model training system 200 shown in FIG. 2. The model training apparatus is configured to train the angle recognition model used in the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6. The model training apparatus may include:

a training module, configured to train a plurality of times until the accuracy of classifying shooting angles of sample images in a training image set by the angle recognition model reaches a predetermined threshold. This training may include:

acquiring sample images containing a sample object captured by a second camera and a depth map corresponding to each of the sample images;

acquiring a first key point and a second key point of the sample object from the depth map;

determining a shooting angle of each of the sample images based on three-dimensional coordinates of the first key point and three-dimensional coordinates of the second key point, the shooting angle being configured to characterize a direction relative to the sample object when the second camera shoots the sample images; and

inputting the sample image into a deep learning model to obtain a predicted shooting angle of the sample image, and determining a classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.

In some embodiments of the present disclosure, determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point includes:

calculating a shooting angle of the sample image by using an angle calculation formula, wherein the angle calculation formula is:

$\left\{ \begin{matrix} V_{1} = \left( x_{2} - x_{1},\ z_{2} - z_{1} \right) \\ V_{2} \cdot V_{1} = 0 \\ \alpha = \arccos\left( \dfrac{V_{2} \cdot V_{Z}}{\left| V_{2} \right| \times \left| V_{Z} \right|} \right) \\ \left| V_{2} \right| = \left| V_{Z} \right| = 1 \end{matrix} \right.$

wherein the three-dimensional coordinates of the first key point are (x₁, y₁, z₁), and the three-dimensional coordinates of the second key point are (x₂, y₂, z₂); V₁ represents a vector of a connection line between the first key point and the second key point in an XZ plane in a world coordinate system; V₂ represents a unit vector perpendicular to V₁; V_Z represents a unit vector parallel to the Z axis in the world coordinate system; and α represents the shooting angle.
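The angle calculation can be sketched numerically as follows. Two unit vectors are perpendicular to V₁ in the XZ plane, and the text does not say which one is V₂; this sketch assumes V₂ is V₁ rotated by 90° to (−dz, dx), and takes V_Z as (0, 1) in (x, z) components. The function name is hypothetical.

```python
import math


def shooting_angle(p1, p2):
    """Shooting angle in degrees from two key points given as (x, y, z) tuples."""
    x1, _, z1 = p1
    x2, _, z2 = p2
    v1 = (x2 - x1, z2 - z1)              # connection line projected onto the XZ plane
    norm = math.hypot(v1[0], v1[1])
    if norm == 0.0:
        raise ValueError("key points coincide in the XZ plane")
    v2 = (-v1[1] / norm, v1[0] / norm)   # assumed choice of unit perpendicular
    vz = (0.0, 1.0)                      # unit vector parallel to the Z axis
    cos_alpha = v2[0] * vz[0] + v2[1] * vz[1]
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding error
    return math.degrees(math.acos(cos_alpha))
```

Because arccos returns values in [0°, 180°], this formula alone cannot distinguish all directions in [0, 360°).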

In some embodiments of the present disclosure, after determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further includes:

identifying, based on the sample image, whether an orientation posture of the sample object relative to the second camera is a back-facing orientation posture; and correcting the shooting angle by using a correction calculation formula when the orientation posture of the sample object relative to the second camera is the back-facing orientation posture, to obtain a corrected shooting angle, wherein the correction calculation formula is: α1=α2+180°, in which α1 is the corrected shooting angle and α2 is the shooting angle before correction.
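The correction for a back-facing orientation posture can be sketched as below. The formula in the text adds 180°; wrapping the result into [0, 360) with a modulo is an assumption added here so that an uncorrected angle of exactly 180° does not produce 360°. The function name is hypothetical.

```python
def corrected_shooting_angle(angle_deg, is_back_facing):
    """Apply the back-facing correction (add 180 degrees) when required."""
    if not is_back_facing:
        return angle_deg
    # Assumed wrap-around so the result stays within [0, 360).
    return (angle_deg + 180.0) % 360.0
```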

In some embodiments of the present disclosure, prior to determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training process further includes:

identifying whether a distance between the first key point and the second key point is less than a distance threshold; and determining that the shooting angle of the sample image is a specified angle in response to identifying that the distance between the first key point and the second key point is less than the distance threshold, wherein the specified angle is any angle within an angle interval of a fixed range.
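The distance check can be sketched as a guard that runs before the angle calculation. The passage does not state in which coordinate space the distance is measured, so this sketch simply takes the Euclidean distance between whatever coordinates the caller supplies; the function name and the `None` return convention are assumptions.

```python
import math


def angle_if_key_points_too_close(p1, p2, distance_threshold, specified_angle):
    """Return specified_angle when the key points are closer than the threshold.

    Returns None otherwise, signalling that the shooting angle should be
    calculated from the key point coordinates instead.
    """
    if math.dist(p1, p2) < distance_threshold:
        return specified_angle
    return None
```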

At least one embodiment of the present disclosure also provides a three-dimensional reconstruction system, which may include a reconstruction server and a first camera. The structure of the three-dimensional reconstruction system may refer to the structure of the three-dimensional reconstruction system shown in FIG. 1. The reconstruction server may include the three-dimensional reconstruction apparatus 700 shown in FIG. 9, FIG. 10 or FIG. 11.

In some embodiments of the present disclosure, the three-dimensional reconstruction system includes a fitting mirror. The fitting mirror is configured to, when a target object is detected, send an acquisition request to the reconstruction server, the acquisition request being configured to request to acquire the three-dimensional image of the target object from the reconstruction server, and the acquisition request carrying information of the target object. The reconstruction server is configured to send an acquisition response to the fitting mirror based on the information of the target object, the acquisition response carrying the three-dimensional image of the target object.
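The request and response exchange between the fitting mirror and the reconstruction server can be sketched with plain in-process objects. Everything here is hypothetical: the class names, the use of a string as the information of the target object, and the use of bytes for the three-dimensional image are illustrative choices, and no real network transport is shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AcquisitionRequest:
    object_info: str  # information of the target object


@dataclass
class AcquisitionResponse:
    object_info: str
    three_d_image: Optional[bytes]  # None if no image is available


class ReconstructionServer:
    def __init__(self, images: Dict[str, bytes]):
        self._images = dict(images)

    def handle(self, request: AcquisitionRequest) -> AcquisitionResponse:
        # Look up the three-dimensional image by the target object information.
        image = self._images.get(request.object_info)
        return AcquisitionResponse(request.object_info, image)


class FittingMirror:
    def __init__(self, server: ReconstructionServer):
        self._server = server

    def on_target_detected(self, object_info: str) -> Optional[bytes]:
        # Send an acquisition request and unpack the acquisition response.
        response = self._server.handle(AcquisitionRequest(object_info))
        return response.three_d_image
```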

At least one embodiment of the present disclosure also provides a model training system. The model training system may include a training server and a second camera. The structure of the model training system may refer to the structure of the model training system shown in FIG. 2. The training server may include the training module described in the foregoing embodiment.

A person skilled in the art may clearly understand that, for the convenience and brevity of the description, the working process of the above-described system and apparatus may refer to the corresponding process in the foregoing method embodiments, and details are not described herein again.

At least one embodiment of the present disclosure further provides a non-volatile computer-readable storage medium configured to store at least one code instruction therein. The at least one code instruction, when executed by a processor, enables the processor to perform the three-dimensional reconstruction method described in the above embodiments, e.g., the three-dimensional reconstruction method shown in FIG. 3, FIG. 5, or FIG. 6.

At least one embodiment of the present disclosure further provides a computer-readable storage medium which is a non-volatile storage medium configured to store at least one code instruction therein. The at least one code instruction, when executed by a processor, enables the processor to perform the model training method described in the above embodiments, e.g., the training process shown in FIG. 7.

The terms “first” and “second” used in the present disclosure are merely used for description and do not denote or imply any relative importance. The term “a plurality of” means two or more, unless otherwise expressly provided.

It should be understood by a person of ordinary skill in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by programs that instruct the relevant hardware. The programs may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk or the like.

The above are just the optional embodiments of the present disclosure, which do not limit the present disclosure. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present disclosure shall all fall within the protection scope of the present disclosure.

What is claimed is:
 1. A three-dimensional reconstruction method, comprising: acquiring first images captured by a first camera, each of the first images being an image containing a target object; determining a shooting angle of the each of the first images, the shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images; determining an angle interval corresponding to the each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle, and setting the first image as a target image in the angle interval; and three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.
 2. The method according to claim 1, wherein determining the shooting angle of the each of the first images comprises: inputting the first image into an angle recognition model; receiving angle information outputted by the angle recognition model; and determining the angle information as the shooting angle; wherein the angle recognition model is a model obtained by learning and training sample images and shooting angles of the sample images.
 3. The method according to claim 1, wherein after acquiring the first images captured by the first camera, the method further comprises: assigning a tag to the each of the first images, the tag being configured to mark the target object in the first image; and classifying the first images containing the target object into an image set based on the tag of the each of the first images; prior to three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object, the method further comprises: acquiring the first images corresponding to the respective angle intervals based on the image set.
 4. The method according to claim 3, wherein after determining the angle interval corresponding to the each of the first images from the plurality of angle intervals included in the angle range [0, 360°) based on the shooting angle, the method further comprises: identifying, for each of the plurality of angle intervals, whether the image set comprises a plurality of first images corresponding to the angle interval; quality-scoring the plurality of first images to obtain an image quality score of the each of the first images in the plurality of first images, in response to identifying that the image set comprises the plurality of first images corresponding to the angle interval; and reserving the first image with the highest image quality score, and deleting the remaining first images.
 5. The method according to claim 1, wherein after acquiring the first images captured by the first camera, the method further comprises: identifying whether a resolution of the each of the first images is less than a resolution threshold; deleting the first image in response to identifying that the resolution of the first image is less than the resolution threshold; and modifying the first image as an image with a specified resolution in response to identifying that the resolution of the first image is not less than the resolution threshold, wherein the specified resolution is greater than or equal to the resolution threshold.
 6. The method according to claim 1, wherein three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object comprises: three-dimensionally reconstructing, when the each of the plurality of angle intervals is provided with a corresponding target image, the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object.
 7. The method according to claim 1, wherein three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object comprises: acquiring, when a three-dimensional reconstruction instruction is received, a plurality of first images based on information of the target object carried by the three-dimensional reconstruction instruction; determining the corresponding angle intervals of the plurality of first images based on the plurality of first images, and determining the plurality of first images as target images in the corresponding angle intervals; three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object; identifying whether the three-dimensional image is an incomplete three-dimensional image; and repairing the incomplete three-dimensional image in response to identifying that the three-dimensional image is the incomplete three-dimensional image to obtain a repaired three-dimensional image.
 8. A three-dimensional reconstruction apparatus, comprising: a processor; and a memory for storing at least one program code executable by the processor, wherein the at least one program code, when executed by the processor, enables the processor to be configured to: acquire first images captured by a first camera, each of the first images being an image containing a target object; determine a shooting angle of the each of the first images by using an angle recognition model, the shooting angle being configured to characterize a shooting direction relative to the target object when the first camera shoots the first images, and the angle recognition model being a model obtained by learning and training sample images and shooting angles of the sample images; determine an angle interval corresponding to the each of the first images from a plurality of angle intervals included in an angle range [0, 360°) based on the shooting angle of the first image, and set the first image as a target image in the angle interval; and three-dimensionally reconstruct the target object based on the target images in the respective angle intervals to obtain a three-dimensional image of the target object.
 9. A three-dimensional reconstruction system, comprising a reconstruction server and a first camera, wherein the reconstruction server comprises the three-dimensional reconstruction apparatus as defined in claim 8.
 10. The system according to claim 9, further comprising a fitting mirror, wherein the fitting mirror is configured to, when a target object is detected, send an acquisition request to the reconstruction server, the acquisition request carrying information of the target object; and the reconstruction server is configured to send an acquisition response to the fitting mirror based on the information of the target object, the acquisition response carrying a three-dimensional image of the target object.
 11. A model training method, configured to train an angle recognition model, the method comprising: training a plurality of times until the accuracy of classifying shooting angles of sample images in a training image set by the angle recognition model reaches a predetermined threshold, wherein the training comprises: acquiring sample images containing a sample object captured by a second camera and a depth map corresponding to each of the sample images; acquiring a first key point and a second key point of the sample object from the depth map; determining a shooting angle of the each of the sample images based on three-dimensional coordinates of the first key point and three-dimensional coordinates of the second key point, the shooting angle being configured to characterize a direction relative to the sample object when the second camera shoots the sample images; and inputting the sample image into a deep learning model to obtain a predicted shooting angle of the sample image, and determining a classification accuracy of the shooting angle according to the shooting angle and the predicted shooting angle of the sample image.
 12. The method according to claim 11, wherein determining the shooting angle of the each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point comprises: calculating a shooting angle of the sample image by using an angle calculation formula, wherein the angle calculation formula is: $\left\{ \begin{matrix} V_{1} = \left( x_{2} - x_{1},\ z_{2} - z_{1} \right) \\ V_{2} \cdot V_{1} = 0 \\ \alpha = \arccos\left( \dfrac{V_{2} \cdot V_{Z}}{\left| V_{2} \right| \times \left| V_{Z} \right|} \right) \\ \left| V_{2} \right| = \left| V_{Z} \right| = 1 \end{matrix} \right.$ wherein the three-dimensional coordinates of the first key point are (x1, y1, z1), and the three-dimensional coordinates of the second key point are (x2, y2, z2); V1 represents a vector of a connection line between the first key point and the second key point in an XZ plane in a world coordinate system; V2 represents a unit vector perpendicular to V1; VZ represents a unit vector parallel to the Z axis in the world coordinate system; and α represents the shooting angle.
 13. The method according to claim 11, wherein after determining the shooting angle of the each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training further comprises: identifying, based on the sample image, whether an orientation posture of the sample object relative to the second camera is a back-facing orientation posture; and correcting the shooting angle by using a correction calculation formula, when the orientation posture of the sample object relative to the second camera is the back-facing orientation posture to obtain a corrected shooting angle, wherein the correction calculation formula is: α1=α2+180°; wherein α1 is the corrected shooting angle; and α2 is the shooting angle before correction.
 14. The method according to claim 11, wherein prior to determining the shooting angle of each of the sample images based on the three-dimensional coordinates of the first key point and the three-dimensional coordinates of the second key point, the training further comprises: identifying whether a distance between the first key point and the second key point is less than a distance threshold; and determining that the shooting angle of the sample image is a specified angle in the case that the distance between the first key point and the second key point is less than the distance threshold, wherein the specified angle is any angle within an angle interval of a fixed range.
 15. A non-volatile computer-readable storage medium storing at least one code instruction, wherein the at least one code instruction, when executed by a processor, enables the processor to perform the three-dimensional reconstruction method as defined in claim 1.
 16. The apparatus according to claim 8, wherein in order to determine the shooting angle of the each of the first images, the at least one program code, when executed by the processor, enables the processor to be configured to: input the first image into an angle recognition model; receive angle information outputted by the angle recognition model; and determine the angle information as the shooting angle; wherein the angle recognition model is a model obtained by learning and training sample images and shooting angles of the sample images.
 17. The apparatus according to claim 8, wherein the at least one program code, when executed by the processor, enables the processor to be further configured to: assign a tag to the each of the first images, the tag being configured to mark the target object in the first image; and classify the first images containing the target object into an image set based on the tag of the each of the first images; prior to three-dimensionally reconstructing the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object, the at least one program code, when executed by the processor, enables the processor to be configured to: acquire the first images corresponding to the respective angle intervals based on the image set.
 18. The apparatus according to claim 17, wherein the at least one program code, when executed by the processor, enables the processor to be further configured to: identify, for each of the plurality of angle intervals, whether the image set comprises a plurality of first images corresponding to the angle interval; quality-score the plurality of first images to obtain an image quality score of the each of the first images in the plurality of first images, in response to identifying that the image set comprises the plurality of first images corresponding to the angle interval; and reserve the first image with the highest image quality score, and delete the remaining first images.
 19. The apparatus according to claim 8, wherein the at least one program code, when executed by the processor, enables the processor to be further configured to: identify whether a resolution of the each of the first images is less than a resolution threshold; delete the first image in response to identifying that the resolution of the first image is less than the resolution threshold; and modify the first image as an image with a specified resolution in response to identifying that the resolution of the first image is not less than the resolution threshold, wherein the specified resolution is greater than or equal to the resolution threshold.
 20. The apparatus according to claim 8, wherein in order to three-dimensionally reconstruct the target object based on the target images in the respective angle intervals to obtain the three-dimensional image of the target object, the at least one program code, when executed by the processor, enables the processor to be configured to: three-dimensionally reconstruct, when the each of the plurality of angle intervals is provided with a corresponding target image, the target object based on the target image in each of the angle intervals to obtain a three-dimensional image of the target object.